Classification of Protein Interaction Sentences via Gaussian Processes
Abstract
The growing availability of protein interaction studies in textual format, coupled with the demand for easier access to their key results, has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extracting small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine a classifier that has rarely been applied to text: Gaussian processes (GPs). GPs are a nonparametric probabilistic analogue to the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naïve Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the absence of a margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework, makes GPs an appealing alternative worthy of further adoption.
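To make the idea of a probabilistic kernel classifier on sentences concrete, here is a minimal sketch in plain Python. It is not the method evaluated in the study: it uses GP regression on ±1 labels followed by a probit squash (a common simplification of GP classification), a bag-of-words linear kernel, an assumed noise variance of 1.0, and made-up toy sentences. All names (`bow`, `linear_kernel`, `solve`, `predict_proba`, `NOISE`) are illustrative.

```python
import math
from collections import Counter

# Toy, made-up sentences (NOT from the paper's corpora).
# Label 1 = describes an interaction, 0 = does not.
train = [
    ("protein A binds protein B", 1),
    ("kinase X phosphorylates substrate Y", 1),
    ("receptor R interacts with ligand L", 1),
    ("samples were stored at low temperature", 0),
    ("the study was approved by the committee", 0),
    ("cells were cultured for two days", 0),
]

NOISE = 1.0  # assumed noise variance; a hyperparameter, not a value from the paper


def bow(sentence):
    """Bag-of-words token counts for one sentence."""
    return Counter(sentence.lower().split())


def linear_kernel(a, b):
    """Dot product between two bag-of-words count vectors."""
    return float(sum(a[w] * b[w] for w in a if w in b))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


feats = [bow(s) for s, _ in train]
targets = [1.0 if y == 1 else -1.0 for _, y in train]

# Kernel matrix with noise added on the diagonal: K + NOISE * I
K = [
    [linear_kernel(a, b) + (NOISE if i == j else 0.0) for j, b in enumerate(feats)]
    for i, a in enumerate(feats)
]
alpha = solve(K, targets)


def predict_proba(sentence):
    """Approximate P(label = 1) from the GP posterior mean and variance."""
    x = bow(sentence)
    k_star = [linear_kernel(x, f) for f in feats]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = linear_kernel(x, x) - sum(k * vi for k, vi in zip(k_star, v))
    z = mean / math.sqrt(1.0 + max(var, 0.0))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # probit squash
```

Calling `predict_proba("protein C binds kinase X")` gives a probability above 0.5, while a sentence sharing no words with the training data gives exactly 0.5. Unlike an SVM decision value, the output is a probability, and this sketch has no margin parameter to tune.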
Related resources
Protein interaction detection in sentences via Gaussian Processes: a preliminary evaluation
The non-parametric, deterministic support vector machines (SVMs) achieve high performance in text classification. This article offers a much-needed evaluation of the Gaussian process (GP) classifier, a non-parametric probabilistic analogue to SVMs that has rarely been applied to text classification. We provide an extensive experimental comparison of the performance and properties...
Joint Emotion Analysis via Multi-task Gaussian Processes
We propose a model for jointly predicting multiple emotions in natural language sentences. Our model is based on a low-rank coregionalisation approach, which combines a vector-valued Gaussian Process with a rich parameterisation scheme. We show that our approach is able to learn correlations and anti-correlations between emotions on a news headlines dataset. The proposed model outperforms both ...
The Rate of Entropy for Gaussian Processes
In this paper, we show that the Tsallis entropy rate for stochastic processes can be obtained as the limit of conditional entropy, as was done for the Shannon and Rényi entropy rates. Using this, we obtain the Tsallis entropy rate for stationary Gaussian processes. Finally, we derive the relation between Rényi, Shannon and Tsallis entropy rates for stationary Gaussian proc...
Complete convergence of moving-average processes under negative dependence sub-Gaussian assumptions
Complete convergence is investigated for moving-average processes of doubly infinite sequences of negatively dependent sub-Gaussian random variables with zero means, finite variances and absolutely summable coefficients. As a corollary, the rate of complete convergence is obtained under some suitable conditions on the coefficients.
Improved prediction of missing protein interactome links via anomaly detection
Interactomes such as protein interaction networks have many undiscovered links between entities. Experimental verification of every link in these networks is prohibitively expensive, and therefore computational methods to direct the search for possible links are of great value. The problem of finding undiscovered links in a network is also referred to as the link prediction problem. A popular a...